3 research outputs found
Deep Generative Models for Reject Inference in Credit Scoring
Credit scoring models based on accepted applications may be biased, and this
bias can have statistical and economic consequences. Reject inference is
the process of attempting to infer the creditworthiness status of the rejected
applications. In this research, we use deep generative models to develop two
new semi-supervised Bayesian models for reject inference in credit scoring, in
which we model the data generating process to be dependent on a Gaussian
mixture. The goal is to improve the classification accuracy in credit scoring
models by adding rejected applications. Our proposed models infer the unknown
creditworthiness of the rejected applications by exact enumeration of the two
possible outcomes of the loan (default or non-default). The efficient
stochastic gradient optimization technique used in deep generative models makes
our models suitable for large data sets. Finally, the experiments in this
research show that our proposed models perform better than classical and
alternative machine learning models for reject inference in credit scoring.
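
As a rough illustration of the enumeration step described above, the sketch below treats the unknown outcome of a rejected application as a binary variable and sums over both values. It is not the authors' implementation: the function name, its arguments, and the assumption that a per-outcome lower bound has already been computed elsewhere are all hypothetical, and the form follows the standard semi-supervised variational objective rather than anything spelled out in the abstract.

```python
import numpy as np

def rejected_application_bound(q_default, bound_default, bound_nondefault):
    """Bound for applications with unknown outcome (hypothetical helper).

    q_default        : classifier probability of default, one value per application
    bound_default    : lower bound computed as if the outcome were default
    bound_nondefault : lower bound computed as if the outcome were non-default
    """
    # Clip to avoid log(0) when the classifier is fully confident.
    q = np.clip(np.asarray(q_default, dtype=float), 1e-12, 1 - 1e-12)
    b1 = np.asarray(bound_default, dtype=float)
    b0 = np.asarray(bound_nondefault, dtype=float)

    # Exact enumeration: weight each outcome's bound by its probability.
    expected = q * b1 + (1 - q) * b0
    # Entropy of the classifier's prediction over the two outcomes.
    entropy = -(q * np.log(q) + (1 - q) * np.log(1 - q))
    return expected + entropy
```

Because there are only two outcomes, the sum is exact and no sampling over the label is needed; in a full model the two per-outcome bounds would come from the generative network.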
Learning Latent Representations of Bank Customers With The Variational Autoencoder
Learning data representations that reflect the customers' creditworthiness
can improve marketing campaigns, customer relationship management, data and
process management or the credit risk assessment in retail banks. In this
research, we adopt the Variational Autoencoder (VAE), which has the ability to
learn latent representations that contain useful information. We show that it
is possible to steer the latent representations in the latent space of the VAE
using the Weight of Evidence and forming a specific grouping of the data that
reflects the customers' creditworthiness. Our proposed method learns a latent
representation of the data, which shows a well-defined clustering structure
capturing the customers' creditworthiness. These clusters are well suited for
the aforementioned banks' activities. Further, our methodology generalizes to
new customers, captures high-dimensional and complex financial data, and scales
to large data sets.
Comment: arXiv admin note: substantial text overlap with arXiv:1806.0253
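
For readers unfamiliar with the Weight of Evidence mentioned above, the following is a minimal sketch of how it is commonly computed for a grouped variable. The function name and the sign convention (log of the share of non-defaults over the share of defaults) are assumptions; the abstract does not say which convention or binning the authors use, nor how the values are fed to the VAE.

```python
import math
from collections import Counter

def weight_of_evidence(groups, defaults):
    """Weight of Evidence per group (hypothetical helper).

    groups   : group or bin label for each customer
    defaults : 1 (or True) if the customer defaulted, 0 otherwise
    """
    good, bad = Counter(), Counter()
    for group, defaulted in zip(groups, defaults):
        (bad if defaulted else good)[group] += 1

    total_good = sum(good.values())
    total_bad = sum(bad.values())

    woe = {}
    for group in set(good) | set(bad):
        # WoE is undefined when a group has no goods or no bads;
        # such groups are skipped here instead of being smoothed.
        if good[group] == 0 or bad[group] == 0:
            continue
        share_good = good[group] / total_good
        share_bad = bad[group] / total_bad
        woe[group] = math.log(share_good / share_bad)
    return woe
```

Under this convention, a positive value marks a group with relatively more non-defaulting customers and a negative value marks a group with relatively more defaults.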
Discriminative Multimodal Learning via Conditional Priors in Generative Models
Deep generative models with latent variables have been used lately to learn
joint representations and generative processes from multi-modal data. These two
learning mechanisms can, however, conflict with each other and representations
can fail to embed information on the data modalities. This research studies the
realistic scenario in which all modalities and class labels are available for
model training, but where some modalities and labels required for downstream
tasks are missing. In this scenario, we show that the variational lower bound
limits mutual information between joint representations and missing modalities.
To counteract these problems, we introduce a novel conditional multi-modal
discriminative model that uses an informative prior distribution and optimizes
a likelihood-free objective function that maximizes mutual information between
joint representations and missing modalities. Extensive experimentation
demonstrates the benefits of our proposed model: empirical results show that
our model achieves state-of-the-art results in representative problems such as
downstream classification, acoustic inversion, and image and annotation
generation.
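
To make the role of a prior in a variational lower bound more concrete, the sketch below computes the KL divergence between an approximate posterior and a prior whose parameters depend on some observed quantity, such as a class label. This is only an assumption-laden illustration: the abstract does not state the form of the informative prior, so diagonal Gaussians are assumed here, and the function and argument names are hypothetical. It also does not reproduce the likelihood-free objective the abstract describes.

```python
import numpy as np

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL( N(mu_q, var_q) || N(mu_p, var_p) ) for diagonal Gaussians.

    mu_q, logvar_q : posterior mean and log-variance (e.g. from an encoder)
    mu_p, logvar_p : prior mean and log-variance (e.g. conditioned on a label)
    """
    mu_q, logvar_q = np.asarray(mu_q, float), np.asarray(logvar_q, float)
    mu_p, logvar_p = np.asarray(mu_p, float), np.asarray(logvar_p, float)

    # Closed form, summed over the latent dimensions (last axis).
    per_dim = (
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0
    )
    return 0.5 * per_dim.sum(axis=-1)
```

With a standard normal prior, `mu_p` and `logvar_p` are zero; with a conditional prior they would instead be produced from the conditioning variable, so the penalty pulls each representation toward a region specific to that variable.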